Query-based sampling of text databases
Similar resources
Modeling Query-Based Access to Text Databases
Searchable text databases abound on the web. Applications that require access to such databases often resort to querying to extract relevant documents because of two main reasons. First, some text databases on the web are not “crawlable,” and hence the only way to retrieve their documents is via querying. Second, applications often require only a small fraction of a database’s contents, so retr...
Content-based query of image databases: inspirations from text retrieval
In this paper we report the application of techniques inspired by text retrieval research to the content-based query of image databases. In particular, we show how the use of an inverted file data structure permits the use of a feature space of O(10^4) dimensions, by restricting search to the subspace spanned by the features present in the query. A suitably sparse set of colour and texture featur...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query grows, they need too much memory and processing to remain practical, so we have to use random and evolutionary methods. The use of evolutionary methods, beca...
Automatic Classification of Text Databases Through Query Probing
Many text databases on the web are “hidden” behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only databases. Recently, Yahoo-like directories have started to manually organize these databases into categories that users can browse to find these valuable resources. We propose a novel strategy to automat...
Query-Based Sampling using Snippets
Query-based sampling is a commonly used approach to model the content of servers. Conventionally, queries are sent to a server and the documents in the search results returned are downloaded in full as representation of the server’s content. We present an approach that uses the document snippets in the search results as samples instead of downloading the entire documents. We show this yields eq...
Journal
Journal title: ACM Transactions on Information Systems
Year: 2001
ISSN: 1046-8188,1558-2868
DOI: 10.1145/382979.383040